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Abstract 


The  problem  of  estimating  the  frequencies  of  multiple  sinusoids  from  noisy  observations 
is  addressed  in  this  paper.  A  parametric  filtering  approach,  called  the  PF  method,  is 
proposed  that  leads  to  a  consistent  estimator  of  the  AR  representation  of  the  sinusoidal 
signal,  given  the  number  of  sinusoids.  It  is  accomplished  by  using  an  iterative  procedure 
to  find  a  fixed-point  of  the  parametrized  least  squares  estimator  (from  the  filtered  data) 
that  comprises  a  contraction  mapping  in  the  vicinity  of  the  true  AR  parameter.  Employ¬ 
ing  appropriate  filters,  this  method  is  able  to  achieve  the  accuracy  of  the  nonlinear  least 
squares  estimator,  with  much  less  computational  complexity  and  initialization  require¬ 
ment.  It  can  also  be  implemented  adaptively  (recursively)  in  order  to  track  time- varying 
frequencies.  In  this  way,  the  PF  method  provides  a  flexible  and  efficient  procedure  of 
frequency  estimation.  An  example  of  the  AR  filter  is  investigated  in  detail  to  illustrate 
the  performance  of  the  PF  method. 


Abbreviated  Title:  “Frequency  Estimation  by  Parametric  Filtering” 

Key  words  and  phrases:  Autoregressive,  contraction  mapping,  fixed-point,  frequency 
estimation,  iterative  filtering,  least,  squares,  parametric  filtering,  sinusoid,  spectral  anal¬ 


ysis. 


1  Introduction 


Let  {aq}  be  a  random  process  (signal)  consisting  of  q  superimposed  real  sinusoids  with 
unknown  frequencies,  as  given  by 

<7 

xt  l3kcc>s(u>kt  +  cj)k).  (1.1) 

k= l 

In  this  expression,  q  >  0  is  a  known  integer,  the  fa  and  t Ok  are  unknown  constants, 
satisfying  fa  >  0  and  0  <  uj-i  <  ■  ■  ■  <  uq  <  re,  and  the  <j>k  are  independent  and  uniformly 
distributed  random  variables  on  [0,  2tt).  Suppose  that  the  signal  is  contaminated  by 
additive  noise  so  that  the  observed  mixed-spectrum  process,  denoted  by  { yt },  can  be 
written  as 

yf'=xt  +  €t  (t  =  0,±1,±2,...).  (1.2) 

Assume  that  the  noise  {q}  is  independent  of  { <p^ }  and  can  be  modeled  as  a  linear  process 
of  the  form 

OO 

£<:=  E  U,}~IID(0,<7|),  21^1  <00.  (1.3) 

j—  CO 

Under  these  assumptions,  {yt}  becomes  stationary,  with  mean  zero  and  autocovariance 
function 


ryT  :=E(yt+Tyt)  =  r*  +  r'r, 

where  rxT  :=  E{xtjrTxt)  and  rcT  E{et+Tet)  are  autocovariance  functions  of  the  signal  and 
the  noise,  respectively,  and  can  be  written  as 

r*  =  Y.  \ Pi  cos (wfcr)  and  rT  =  cr|  ^  4’j+r^j-  (1.4) 

A.— t  j 

The  objective  of  frequency  estimation  is  to  find  estimators  of  the  sinusoidal  frequencies 
u>k  on  the  basis  of  a  finite  data  set  [y j, . . .  ,yn }  obtained  from  the  random  process  (1.2). 

Among  a  variety  of  different  procedures  of  frequency  estimation  in  the  literature, 
there  is  one,  known  as  the  AR  approach  (Kay  and  Marple,  1981),  that  resorts  to  the  AR 
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representation  of  the  sinusoidal  signal  under  investigation.  In  fact,  {.rt}  is  a  solution  of 

the  following  homogeneous  autoregressive  (AR)  equation  of  order  2q: 

2  <1 

Y  aj  xt-j  =  0.  (1.5) 

3=0 

The  AR  coefficients  aj  in  this  equation  are  defined  as  the  coefficients  of  the  polynomial 

•4(U  ==  nn  -  e”‘)(*  -  (1.6) 

3=0  k=  1 

It  is  clear  that  the  cij  in  (1.5)  completely  determined  by  04.,  and  vice  versa.  Moreover, 
the  cij  are  real  and  symmetric  in  the  sense  that 


a0  =  1  and  a2g-  j  =  a3  (j  =  0, . . . ,  q  —  1). 


(1.7) 


Therefore,  the  original  problem  of  frequency  estimation  can  be  equivalently  stated  as 
that  of  estimating  the  AR  parameter  vector  a  :=  [ai,  ■  •  •  ,  aq]T ,  where  the  superscript  T 
denotes  matrix  transposition.  This  reparametrization  enables  us  to  employ  many  well- 
studied  linear  methods,  which  usually  end  up  with  solving  systems  of  linear  equations. 
Once  an  estimate  of  a  becomes  available,  the  frequency  estimates  can  be  obtained  from 
the  zeros  of  A(z)  in  (1.6),  with  a,j  replaced  by  their  estimates. 

A  widely-used  procedure  for  estimating  the  AR  parameter  a  is  the  so-called  least 
squares  (LS)  estimator,  also  known  as  Prony's  spectral  line  estimator  (Kay  and  Marple, 
1981),  which  can  be  summarized  as  follows.  Due  to  the  AR  model  (1.5),  the  observed 
process  { yt }  satisfies  the  equation 

2  9  2  q 

Y  a3  iJi-3  ~  e-t  with  et:=Yajet-j •  (1-8) 

j= 0  j= 0 

For  a  given  time  series  {74, . . . ,  yn)  of  length  n  >  2q,  (1 .8),  together  with  (1.7),  gives 


y  =  -YQa  +  e. 


In  this  equation, 


y-2<i+ 1  +  y\ 

y-2q  ■  ■ 

2/2 

e2<?+l 

y  :  = 

!J  n  +  Vn-2q 

Y  := 

yn- 1 

•  Vn- 2q+\ 

e  := 

Cn 

(1.9) 


(1-10) 
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and  Q  is  a  (2 q  —  1  )-by-<y  matrix  of  the  form 


Q:= 


0T  1 

!  o 


where  I  stands  for  the  ( q  —  l)-by-(g  —  1)  identity  matrix,  I  the  ( q  —  1  )-by-(<?  —  1)  reverse 
permutation  matrix,  with  l’s  on  the  antidiagonal  and  0’s  elsewhere,  and  0  the  zero  vector 
of  dimension  q  —  1.  The  LS  estimator  of  a,  denoted  by  a^?,  is  defined  as  the  minimizer 
of  the  sum  of  squared  errors  ||y  +  YQa||2.  Simple  algebra  gives 


ans  =  -Yjy 


where  Y*  is  the  pseudo-inverse  of  YQ,  as  specified  by 


Y1  :=  (QTYTYQ)-]QrYr. 


(1.11) 


(1.12) 


The  AR  approach  is  computationally  attractive  as  compared  to  the  nonlinear  least 
squares  (NLS)  approach,  since  the  former  solves  a  system  of  linear  equations,  whereas 
the  latter  fits  the  data  with  a  sum  of  sinusoids  whose  amplitudes  and  frequences  are 
free  variables,  yielding  a  nonlinear  optimization  problem.  On  the  other  hand,  the  AR 
approach  has  also  some  serious  shortcomings,  among  which  is  the  inconsistency  of  aj_,s  as 
an  estimator  of  the  AR  parameter  a.  In  fact,  as  n  tends  to  infinity,  the  strong  ergodicity 
of  {lit}  gives 

n~lYTYaA-Ry  and  n~ *' Yry  a-4'  ftf  +  if 
(see  Li  and  Kedem,  1991),  where  Ry  and  r;/  are  defined  by 


»o 

y 

r- 1  '  ■ 

.  ry 

1  -27+2 

y 

r\ 

II 

>0h 

ry 

'0 

y 

‘  r-29+3 

and  ry 

y 

^2 

CN 

1 

==>CN 

_ 1 

y 

nq-3  ■  ■ 

y 

*0  J 

1 

iO<ci 

i _ 
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respectively,  with  the  superscript  B  standing  for  the  backward  rearrangement  of  a  vector. 
From  (1.9),  we  can  also  write 

r„  +  if  =  (n-2q)-lE(YTy) 

=  (n  -  29)-l{— £(YxY)Qa  +  E(YTe)j 
=  —  RyQa  T  f£  -f  rf  +  RLQa, 

where  Re  and  re  are  similarly  defined  from  {<t}  as  Rv  and  ry.  The  last  equality  employs 
the  identity  (n  —  2q)~1E(YTe)  —  r£  -f  rf  +  RcQa,  which  can  be  easily  verified  upon 
noting  that  the  vector  e  has  a  similar  representation  as  (1.9),  with  Y  and  y  replaced  by 
the  corresponding  matrices  from  {e(}.  Using  these  results  in  (1.11)  as  n  — ►  oo  yields 

aLS  a-4-  -R;'r,  =  a  -  R;x(re  +  Rfa),  (1.14) 

where 

Ry  :=  QrRyQ,  r-y  —  Q 7'(f„  +  rf )  =  2  Qrfi/, 

and  Re  and  re  are  defined  analogously  from  {e(}.  It  is  readily  seen  from  (1.14)  that  the 
limit  of  aLs  differs  from  the  parameter  a  being  estimated,  since  in  general  r£  +  Rca  ^  0. 
In  other  words,  the  above  AR  approach  produces  an  inconsistent  estimator  of  the  AR 
parameter  a.  It  is  not  surprising,  since  {e(}  in  (1.8)  is  not  white  so  that  {yt}  obeys  an 
ARMA  rather  than  an  AR  model.  As  will  be  seen  later,  this  problem  of  inconsistency 
can  be  resolved  by  a  parametric  filtering  approach,  which  we  call  the  parametric  filtering 
(PF)  method. 

2  Parametric  Filtering  Method 

To  alleviate  the  inconsistency  problem  of  the  AR  approach,  we  shall  consider  estimating 
the  AR  parameter  a,  not  directly  from  the  original  data,  but  from  the  filtered  data,  upon 
using  an  appropriate  parametric  filter.  It  will  be  shown  that  the  LS  estimator  from  the 
filtered  data  can  be  made  consistent,  by  “tuning”  the  filter  parameter  so  as  to  eliminate, 
in  the  filtered  data,  the  counterpart  of  the  vector  re  +  Rea  in  the  limit. 
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2.1  Definition  and  Algorithm 

Keeping  this  idea  in  mind,  let  us  first  introduce  a  linear  time-invariant  causal  filter,  whose 
impulse  response  sequence  is  denoted  by  {/ij(a),  j  =  0,1,...},  where  a  :=  [ai,  •  ■  •  ,  aq]T 
is  a  parameter  vector  that  takes  on  values  in  a  closed  and  bounded  subset  A  of  a  q- 
dimensional  Euclidean  space.  Assume  that  the  filter  is  BIBO-stable,  that  is,  |^j(a)l  < 
oo,  for  all  Let  {y((a)}  be  the  filtered  process  given  by 

oo 

y«(«)  :=  J2hj(a)yt-j, 

3=0 

and  define  the  filtered  signal  {.r((a)}  and  the  filtered  noise  {et(a)}  analogously. 

According  to  the  theory  of  linear  filtering  (see,  e.g.,  Brockwell  and  Davis,  1987),  the 
filtered  signal  {aq(a:)}  remains  a  sum  of  q  sinusoids  which  possess  the  same  frequencies 
as  the  unfiltered  signal  {aq}  in  (1.1),  with  possibly  different  amplitudes  and  phases.  As 

a  matter  of  fact,  aq(a)  can  be  expressed  as 

<? 

xt(a)  =  /3k(a)  cos [ukt  +  ^*(a)}, 

k=  l 

where  /^.(a)  and  are  amplitudes  and  phases  of  the  filtered  sinusoids,  as  specified 

by 

#t(a)  :=  fh  | H{tx>k\ a)|  and  ^.(a)  :=  (f>k  +  AII(ujk]  a)  mod  2ir, 
respectively,  and  H(u\ct)  is  the  transfer  function  of  the  filter,  as  defined  by 

OO 

H{u-,  a)  :=  hj(<x)  e~l3UJ- 

It  is  quite  clear  that  the  original  problem  of  frequency  estimation  is  not  altered  by 
the  linear  filtering  and  can  be  equivalently  stated  as  that  of  estimating  the  same  AR 
parameter  a  in  terms  of  the  filtered  processes. 

Now  let  aLs(a)  be  the  LS  estimator  of  a  on  the  basis  of  the  filtered  time  series 
{yi(ac), . . .  ,yn(at)}.  Since  the  strong  ergodicity  still  holds  for  {?/<(«)},  it  can  be  shown 
as  before  that  aps(a)  converges  almost  surely  to  the  vector 

a(a;)  := -R.;1(o!)r!/(a)  (2.1) 
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which,  similar  to  (1.14),  can  be  written  as 

a(tt)  =  a  -  R“x(q!)  |r£(a;)  +  Rc(o:)  a}.  (2.2) 

In  these  expressions,  the  autocovariances  are  defined  from  the  corresponding  filtered 
processes.  Multiplying  each  side  of  (2.2)  by  RJ/(a),  and  since  Ry(a)  =  Rr(a)  +  Re(a), 
we  obtain 

Rx(a)  a(a)  -f  R£(a)  a(a)  =  R^a)  a  -  r£(a). 

Assuming  for  the  time  being  that  Rj,.(a)  is  nonsingular,  we  obtain 

a  -  a(a)  =  R^(a)  {r£(a)  +  Rt(a)  a(a)}.  (2.3) 

Evidently,  if  there  exists  a  filter  parameter  a*  in  the  interior  of  A  such  that  R£(q;*)  is 
nonsingular  and 

a(a*)  =  -R,-1(a-)r,(a*),  (2.4) 

then  from  (2.3)  we  would  obtain  a(a*)  =  a.  This  implies  that  by  fixing  the  filter 
parameter  a  =  a;*,  the  corresponding  LS  estimator  aLs(c**)  becomes  consistent  for 
estimating  the  AR  parameter  a.  In  other  words,  the  requirement  (2.4)  implies  r£(o;*)  + 
R£(q;*)  a  =  r£(cr*)  +  Re(a*)  a(a*)  —  0. 

Suppose  that  the  autocorrelation  function  of  the  noise  {ej  is  known,  and  that  the 
filter  is  parametrized  so  that 

(Al)  a  =  —  R^1(a)  r£(a)  for  all  a  £  A. 

Under  this  assumption,  (2.4)  reduces  to  a  [a*)  —  a *,  which  indicates  that  the  desired 
filter  parameter  «*  is  a  fixed-point  of  the  mapping  a(cr).  Moreover,  since  a(cr*)  ==  a,  it 
also  follows  that  a*  —  a.  Therefore,  with  (Al)  being  satisfied,  it  is  no  longer  necessary 
to  distinguish  the  AR  parameter  a  and  the  desired  filter  parameter  cr*,  and  the  problem 
of  estimating  a  becomes  identical  to  that  of  estimating  ct*. 

It  is  interesting  to  note  the  similarity  between  (Al)  and  the  Yule- Walker  equations. 
In  fact,  for  an  AR(2ry)  process  with  AR  parameters  satisfying  (1.7),  the  vector  of  the  first 
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q  free  parameters  will  be  the  solution  of  (Al),  with  Rc(a)  and  r£(a)  being  replaced  by 
their  counterparts  defined  from  the  autocovariances  of  that  AR  process. 

Given  a  finite  sample  of  {y<},  let  a(ct)  be  a  consistent  estimator  of  a(ct).  Since  a*  is 
a  fixed-point  of  a(ar),  it  is  reasonable  to  estimate  a*  (=  a)  by  a  fixed-point,  denoted  by 
A,  of  the  random  mapping  a(a),  so  that 

a(A)  =  A.  (2.5) 

We  refer  to  this  procedure  of  frequency  estimation  as  parametric  filtering  method ,  or 
simply  PF  method.  The  fixed-point  A  is  called  PF  estimator  of  the  AR  parameter 
a(=  a*)  and  the  corresponding  angular  frequencies,  denoted  by  il'k,  of  the  zeros  of  A(z) 
in  (1.6),  with  A  in  place  of  a,  are  called  PF  estimators  of  u>k- 

In  order  to  find  the  PF  estimator  A,  a  simple  iterative  algorithm,  known  as  the  fixed- 
point  iteration  (FPI),  can  be  employed.  This  algorithm  generates  a  sequence  {Am}  of 
estimators  according  to  the  iterative  procedure 

Am  =  a(Am_])  (m  —  1,2,...).  (2.6) 

It  will  be  shown  later  that  under  certain  conditions  Am  converges  to  A  as  m  — *  oo  almost 
surely,  provided  that  the  sample  size  n  is  sufficiently  large. 

Note  that  (2.6)  can  be  regarded  as  an  iterative  filtering  procedure,  which  can  be 
implemented  in  two  steps  within  each  iteration: 

STEP  1.  Filter  the  data  with  the  filter  parameter  a  =  Am_i; 

STEP  2.  Compute  Am  =  a(Am_i)  from  the  filtered  data. 

Other  iterative  filtering  procedures  for  frequency  estimation  also  exist  in  the  literature 
(see,  e.g.,  in  Kay,  1984;  Dragosevic  and  Stankovic,  1989). 

2.2  Least  Squares  Estimator 

So  far  we  have  not  specified  the  estimator  a(c*).  In  fact,  any  estimator  would  qualify  as 
long  as  it  converges  (stochastically)  to  a(ct)  when  the  sample  size  n  tends  to  infinity.  In 
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the  following,  we  specialize  the  choice  of  a(ct)  by  considering  the  LS  estimator  from  the 
filtered  data. 

Notice  that  given  the  finite  data  {yl: . . . ,  j/„},  the  filtered  process  {?/<(«)}  can  only 
be  approximated  by 

yt(<*)  :=  J2hj(a)yt-j  =  (2.7) 

i= o 

Therefore,  the  estimator  a(ct)  should  be  obtained  from  {)/](«), . . .  ,yn(a)},  rather  than 
the  unavailable  data  {i/,  (a), . . . ,  ?/n(a:)}.  Enlightened  by  the  estimator  aLs(<*),  which 
depends  on  {y((o;)},  we  may  take 

a(a)  :=  -Yt(a)y(a),  (2.8) 

where  y(a)  and  Y^a),  defined  from  {^(cr)},  are  the  counterparts  of  y  and  Y*  given  by 
(1.10)  and  (1.12).  It  is  readily  shown  that  a(<x)  in  (2.8)  is  the  LS  estimator  that  minimizes 
the  criterion  ||y(a)  +  Y(a)  Q  a(a)||2.  It  will  be  shown  later  that  under  appropriate 
conditions  a(<x)  in  (2.8)  converges  almost  surely  to  a(c»r)  as  n  tends  to  infinity.  In  other 
words,  a(c*)  is  a  strongly  consistent  estimator  of  a(a).  This  strong  consistency  can  also 
been  shown  to  be  uniform  in  a  so  that  many  properties  of  a(a),  as  a  deterministic 
function  of  cr,  are  retained  by  its  estimator  a(a). 

2.3  Relation  to  the  CM  Method 

It  should  be  pointed  out  that  the  PF  method  presented  above  is  an  extension  of  a  fre¬ 
quency  estimation  procedure  recently  proposed  by  He  and  Kedem  (1990)  and  by  Yalkowitz 
(1991).  A  somewhat  more  systematic  and  rigorous  treatment  of  this  procedure  and  its 
statistical  properties  can  be  found  in  Li  and  Kedem  (1991),  and  Li,  Kedem,  and  Yakowitz 
(1991),  where  the  procedure  was  referred  to  as  CM  method  (or  contraction  mapping 
method).  In  essence,  the  CM  method  deals  with  the  case  of  a  single  sinusoid  (at  a 
time)  in  ambient  noise,  whereas  the  PF  method  extends  it  to  the  general  case  of  multiple 
sinusoids.  Indeed,  for  <7  =  1,  we  note  that  (Al)  becomes 

a  ~  — 2  ?’i(a)/?’o(ar)  (2.9) 
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and  the  LS  estimator  in  (2.8)  reduces  to 

n  n 

«(«)  =  -  {&(")  +  itt-M)  I  Yjtf-  i(q)- 

4= 3  4=3 

If  we  reparametrize  the  filter  by  setting  i)  :=  —a/2,  then  (2.9)  can  be  written  as 

where  pj(i?)  stands  for  the  first-order  autocorrelation  of  {^(a)}.  This  relation  is  read¬ 
ily  recognized  as  being  the  fundamental  property  required  by  the  CM  method  for  the 
parametrization  of  the  filter.  Moreover,  it  can  be  shown  (Li  and  Kedem,  1991)  that  under 
appropriate  conditions  p('0)  :=  —a(a)/ 2  is  a  consistent  estimator  of  the  first-order  auto¬ 
correlation  of  {yt(a)}.  With  this  estimator,  the  fixed-point  iteration  in  (2.6)  becomes 

dm  =  p(dm-i)  (m  =  1,2, ... ), 

which  coincides  with  the  iteration  of  the  CM  method  that  produces  a  sequence  {t?m}  for 
estimating  t?*  :=  —a*/2  —  cosuq. 

Statistical  properties  of  the  CM  method  have  recent  ly  been  studied  by  Li  and  Kedem 
(1991)  and  Li,  Kedem,  and  Yakowitz  (1991).  In  these  works,  it  was  proved  that  under 
appropriate  conditions  the  CM  method  provides  a  strongly  consistent  estimator  of  uq, 
and  that  the  estimator  is  asymptotically  normal  with  a  variance  inversely  related  to  the 
signal-to-noise  ratio  of  the  data.  In  the  next  section,  we  shall  analyze  the  PF  method 
along  the  same  lines  as  in  these  works  in  order  to  establish  the  strong  consistency  and 
asymptotic  normality  of  the  PF  estimator  O'. 

2,4  Nonsingularity  of  Autocovariance  Matrices 

To  end  this  section,  let  us  consider  the  conditions  under  which  R^ct)  and  Rf(a)  are  non¬ 
singular  so  that  the  inverse  matrices  in  (2.3)  and  (2.4)  are  well-defined  in  the  derivation 
of  the  PF  method.  For  this  purpose,  we  first  extend  /?/.(«)  and  cu/;  for  k  —  q  -f  1, . . . ,  2q 
by  defining 

/?29-it+i(tt)  :=  /4(«)  and  u>2q-k+ 1  :=  —Uk  {k  =  \ 
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In  so  doing,  we  can  write  the  autocovariance  function  r*(ct)  as 

rr(<*)  =  E  2 Pk{<*)  cos Hr)  =  E  {Pk(<*)4 

k= i  r— i 

with  Zk  :=  exp(iu4.).  It  is  not  difficult  to  verify  from  this  expression  that  R^a)  can  be 
decomposed  as 

R*(a)  -  QtSPShQ,  (2.10) 

where  the  superscript  H  denotes  the  Ilermitian  transpose.  In  this  decomposition,  P  is  a 
2q-by-2q  diagonal  matrix  of  the  form 

P  :=  f  diag{ft2(a),...,$g(a)} 

and  S  is  a  (2 q  —  l)-by-2('/  Vandermonde  matrix  as  specified  by 

S:=[slt...  ,s2,]  (2.11) 

with  s k  :=  [1  ,zlq~2]1.  Since  Q  is  of  full  column  rank  q  and  Sn  of  full  column 

rank  2 q  —  1,  the  decomposition  in  (2.10)  indicates  that  R^cc)  will  have  full  rank  q  if  P 
is  nonsingular.  It  is  easy  to  see  that  the  nonsingularity  of  P  is  guaranteed  if  /?*(ar)  >  0, 
or,  equivalently, 

(A2)  H(u>k\  a)  ^  0  for  all  h  —  1, . . . ,  q  and  oc  £  A. 

That  is,  the  filter  should  pass  all  frequencies  of  the  signal.  Therefore,  under  (A2),  R.J; ( a ) 
is  nonsingular  and  hence  the  derivation  of  (2.3)  is  valid.  Moreover,  the  nonsingularity 
of  Rc(a)  is  guaranteed  if  /'^(a)  =  var{t((a:)}  >  0  (see,  e.g.,  Brock  well  and  Davis,  1987, 
Proposition  5.1.1).  By  the  continuity  of  H{u>\  a)  in  u>,  ?•„(«)  =  0  if  and  only  if  H(u;  a)  = 
0  for  all  u  6  (—%,%].  Hence  Re(a)  is  also  nonsingular  under  (A2). 

3  Statistical  Properties  of  the  PF  Estimator 

To  investigate  statistical  properties  oi  the  PF  method  presented  in  the  preceding  section, 
we  shall  answer  the  following  questions: 
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i)  Under  what  conditions  does  the  random  mapping  a(ct)  have  a  fixed-point  a:? 

ii)  Under  what  conditions  does  the  fixed-point  iteration  in  (2.6)  converge  to  the  fixed- 
point  a?  and 

iii)  What  limit  and  limiting  distribution  does  the  PF  estimator  a  possess  as  the  sample 
size  n  tends  to  infinity? 

This  section  provides  a  set  of  sufficient  conditions  under  which  a  unique  fixed-point  a 
exists  and  can  be  found  in  the  vicinity  of  a*  by  the  fixed-point  iteration  almost  surely,. 
Under  these  conditions,  a  is  also  shown  to  be  strongly  consistent  and  asymptotically 
normal  for  estimating  ex*.  These  results  are  formulated  in  terms  of  the  LS  estimator 
a(cx)  defined  by  (2.8). 

3.1  Existence  and  Convergence 

Suppose  that  the  filter  also  satisfies  the  following  regularity  conditions: 

(A3)  There  exist  constants  c.j  >  0  with  YSfLojcj  <  00  such  that  \hj(a)\  <  Cj  for  all 
j  =  0,1,...  and  ot  £  A. 

(A4)  The  hj(ac)  are  continuously  differentiable  in  A  and  there  exist  constants  djk  >  0 
with  YIJLq  jdjk  <  oo  such  that  \dhfia.)  /  dotk\  <  djk  for  all  k  —  1, . . . ,  q,  j  =  0, 1, . . . , 
and  a  £  A. 

Under  these  conditions,  we  shall  first  show  that  the  random  mapping  a(a:)  is  uniformly 
consistent  for  estimating  a(«)  up  to  the  first  derivative,  as  summarized  by  the  following 
lemma. 

Lemma  3.1  Suppose  that  (A1)-(A4)  are  satisfied.  Then,  a(o:)  and  a(a)  are  continu¬ 
ously  differentiable,  and,  as  n  oo, 

a(a)  “A  a(a)  and  a'(a)  ^4'  a'(a) 

unifromly  in  ct  £  A,  where  afic*)  and  a'(c*)  are  Jacobian  matrices  of  a(c*)  and  a (») , 
respectively. 
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PROOF.  The  proof  resorts  to  some  basic  results  of  Li  and  Kedem  (1991),  concerning  the 
uniform  strong  consistency  of  sample  autocovariances  and  their  derivatives  of  the  filtered 
data.  In  particular,  according  to  Lemma  5.2  and  Remark  5.1  of  Li  and  Kedem  (1991), 
(A3)  guarantees  that 

n~1YT(a)  Y(a)  Ry(a)  and  n~l  YT(a)  y(a)  ^4'  fj,(ar)  +  r®(a) 

uniformly  in  ot  £  A  as  n  — >  oo.  Since  Ry  (o;)  is  nonsingular  under  (A2),  it  follows  that 
Y^(a)  is  well-defined  almost  surely  for  sufficiently  large  n  and 

a(a)  [i4'  -R;1(a)ry(a)  =  a  (a) 

uniformly  in  a  G  A.  Applying  Lemma  5.2  and  Remark  5.1  of  Li  and  Kedem  (1991)  again 

shows  that  a  (a)  and  a(ct)  are  continuously  differentiable  under  (A4)  and  a'(a)  ^4'  a'(a) 

uniformly  in  ot  €  A.  0 

In  the  sequel,  we  shall  also  make  use  of  the  following  lemma. 

Lemma  3.2  Let  o  be  the  spectral  radius  of  the  matrix  C(a*),  where 

C(a):=Ry-1(«)Rt(a).  (3.1) 

Then  (A2)  implies  g  <  1. 

PROOF.  It  is  easy  to  verify  from  (3.1)  that  C(a)  can  also  be  written  as 

C(a)={I  +  r(a)}-i  with  r(a):=Rt-1(a)Ri.(a).  (3.2) 

Let  fij  and  py,  (j  =  1  ,...,c/),  be  the  eigenvalues  and  corresponding  eigenvectors  of 
r(ct*),  then  1/(1  +  pj)  are  eigenvalues  of  C(a*),  associated  with  eigenvectors  p j.  By 
definition  (Ortega  and  Rheinboldt,  1970,  p.  43),  g  =  max{l/|l  +  pj |}.  Therefore,  g  <  1 
if  |1  +  pj |  >  1  for  all  j.  On  the  other  hand,  since  T(a*)pj  =  fijPj,  it  follows  from  (3.2) 
that 


pfR*(«*)P;  =  llj  {pfRt(«*)Pi}- 
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Note  that  R^ex*)  and  Re(cx*)  are  positive  definite  under  (A2).  Therefore,  we  obtain 
pRR^ex*)  pj  >  0  and  p^Rc(ex*)pj  >  0  for  all  j.  This,  in  turn,  yields  pj  >  0  for  all  j. 
As  a  consequence,  we  obtain  |1  +  pj\  =  1  +  pj  >  1  for  all  j  and  hence  g  <  1.  O 

Based  on  these  lemmas,  the  existence  of  the  PF  estimator  a  as  a  fixed-point  of  a(ex) 
and  the  convergence  of  FPI  to  d  can  be  established  as  follows. 

Theorem  3.1  Under  (Al)~(A4),  the  following  assertions  hold  almost  surely,  provided 
that  n  is  sufficiently  large. 

a)  There  exists  a  neighborhood  S&(cx*)  {ex  :  |[ex  —  ex*||  <  A)  of  ex* ,  with  A  being 
independent  of  n,  in  which  the  random  mapping  a(ex)  has  a  unique  fixed-point  d. 

b)  The  sequence  {dm}  produced  by  (2.6)  converges  to  d  as  m  — >  co  if  d0  €  65(d), 
where  65(d)  C  6a  ( ex  * )  is  a  neighborhood  of  a,  with  8  being  independent  of  n. 

PROOF.  Notice  that  (2.2)  yields  a(tx)  —  a(ex*)  =  —  R~1(ex)  {rc(ex)  +  R£(ex)a).  Since 
(Al)  is  equivalent  to  a  —  —  Rc_1(a)  rc(a),  it  is  easy  to  verify  that 

a(<x)  —  a(a*)  =  R"1  (a)  Re(a)  (ex  —  a)  =  C(ex)  (a  —  a*).  (3.3) 

It  follows  from  the  continuity  of  C(<x)  that  the  Jacobian  matrix  of  a(ex)  at  ex*  is  given 
by  a'(ex*)  =  C(ex*).  By  Lemma  3.2,  the  spectral  radius  of  a'(ex*)  is  strictly  less  than 
1.  Thus  (see,  e.g.,  Ortega  and  Rheinboldt,  1970,  Theorem  2.2.8,  p.  44),  there  exits  a 
norm  ||  •  ||  such  that  ||a'(a*)||  <  1.  Furthermore,  the  continuity  of  a'(ex)  and  C(a:)  also 
guarantees  the  existence  of  a  constant  0  <  c  <  1  and  a  neighborhood  6a(«*)  :=  {a  : 
11°  —  c**||  <  A}  C  A  such  that 

||a,(Q:)||  —  c  anf^  ll^(a)ll  —  c  (^-4) 

for  all  ex  E  6a(«*)-  Let  k  :=  (c  +  l)/2  <  1,  then,  for  any  0  <  A0  <  A,  the  uniform 
convergence  of  a'(ex)  to  a'(ex)  implies  that  ||a'(ex)||  <  k  almost  surely  for  all  ex  E  6a0(o:*), 
provided  that  n  is  sufficiently  large.  On  the  other  hand,  using  the  mean-value  theorem 
(Ortega  and  Rheinboldt,  1970,  p.  71)  we  can  show  that  almost  surely 

||a(aj)  -  a(ex2)|[  <  k  ||«i  -  a2||  (3-5) 
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for  all  aq,  <*2  €  Na0(o:*),  that  is,  the  random  mapping  a(c*)  is  contractive  on  Sa0(&*) 
almost  surely,  provided  that  n  is  sufficiently  large.  In  addition,  the  convergence  of  a(a) 
to  a(ct),  as  shown  in  Lemma  3.1,  guarantees  that 

l|a(«*)  -  «*ll  =  ||a(a*)  -  a(a*)||  <  (1  -  k)A0 

almost  surely  for  sufficiently  large  n.  Therefore,  according  to  Theorem  5.2.3  of  Stoer  and 
Bulirsch  (1980),  the  mapping  a(a)  has  a  unique  fixed-point  d  on  Sa0(oc*).  Since  Ao  is 
arbitrary,  d  must  be  a  unique  fixed-point  of  a(q)  in  the  interior  of  S&(a*).  Part  a)  of 
the  theorem  is  thus  proved.  To  show  Part  b),  we  note  that  for  any  neighborhood  Ss{oc) 
of  the  fixed-point  a,  if  Sg(a)  C  S&(ct*),  then  (3.5)  remains  valid  almost  surely  for  all 
aq,  (X2  G  Ss(oc),  that  is,  a(a)  is  almost  surely  a  contraction  mapping  on  5V(d),  provided 
that  n  is  sufficiently  large.  By  Theorem  5.2.2  of  Stoer  and  Bulirsch  (1980),  the  sequence 
{dm}  produced  by  (2.6)  converges  to  d  as  m  — >  oo  almost  surely,  if  n  is  sufficiently  large 
and  the  initial  value  do  is  chosen  in  .Ss(d).  <) 

REMARK.  Theory  of  numerical  analysis  (see,  e.g.,  Stoer  and  Bulirsch,  1980)  tells  us 
that  the  spectral  radius  q  of  C(a*)  is  crucial  to  the  rate  of  convergence  of  FPI.  Indeed, 
the  smaller  the  spectral  radius  g  is,  the  faster  is  the  convergence  of  FPI  to  d.  As  seen 
in  the  proof  of  Lemma  3.2,  g  =  1/(1  +  where  //,,njn  :=  rnin{//;}.  Therefore,  to 

accelerate  the  convergence  of  FPI,  plu-m  should  be  made  as  large  as  possible.  Notice 
that 

_  .  p//RJ(a*)p 

^  111111  p^o  pwRt(q*)p 

In  the  case  of  q  =  1,  recudes  to  ?'o(a*)/rQ(a*),  which  is  readily  recognized  to  be 
the  signal-to-noise  ratio  of  the  filtered  process  {?/((q)}  with  the  desired  filter  parameter 
oc  —  ex* .  For  q  >  1,  can  be  regarded  as  a  generalized  indicator  of  the  amount  of 
signal  relative  to  the  amount  of  noise  in  the  filtered  process  {^(a*)}. 
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3.2  Strong  Consistency  and  Asymptotic  Normality 

Suppose  that  A  is  the  fixed-point  of  the  random  mapping  a(cr)  in  the  vicinity  of  a*.  In 
the  following,  we  shall  investigate  asymptotic  properties  of  A  as  the  sample  size  n  tends 
to  infinity.  The  following  theorem  claims  the  strong  consistency  of  the  PF  estimator  A 
for  estimating  a*. 

Theorem  3.2  Suppose  that  (Al)-(A4)  are  satisfied,  and  let  A  be  the  unique  fixed-point 
of  a(cr)  in  S&(ct*),  as  given  by  Theorem  3.1.  Then  A  converges  to  oc*  almost  surely  as 
n  — >  oo. 

Proof.  Since  a(A)  =  A  and  a(a*)  —  a*,  it  follows  from  (3.3)  that 

A  —  ex*  =  £a(A)  +  a(A)  —  a(ct*)  =  <?>a(A)  +  C(A)  (A  —  oc*)  (3-6) 

where  <$a(a;)  :=  a(cr)  —  a(a).  By  (3.4),  we  have  ||C(o:)|j  <  c  <  1  for  any  a  €  S&(cx*). 
It  turns  out  that 

ll«  -  «*||  <  l|£a(«)ll  +  ||C(a)||  || A  -  a*||  <  ||Aa(A)||  +c||A  -  a*|| 
and  hence 

0  <  (1  -c)||A  -a*||  <  ||£a(A)||  (3.7) 

almost  surely  for  large  n.  The  uniform  convergence  of  a(cr)  to  a(a),  as  shown  in 
Lemma  3.1,  guarantees  that  || Aa(  A) |[  ^4'  0  as  n  — >  oo,  which,  together  with  (3.7),  proves 
the  assertion.  <C> 

The  asymptotic  normality  can  also  be  established  for  the  PF  estimator  A.  To  this 
end,  we  first  need  the  following  lemma,  concerning  the  asymptotic  normality  of  a(cr*). 

Lemma  3.3  Suppose  that  (A1)-(A3)  are  satisfied.  If  in  addition  E(£f)  <  oo,  then,  as 
n  — »  oo,  y/n 8a(cx*)  —  y/n{a(o:*)  —  a}  converges  in  distribution  to  a  normal  random 
vector  with  mean  zero  and  covariance  matrix 

V  :=  Ry1(«*)QrW(a*)QR"1(a*) 
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where  W(a*)  :=  \wij{cx*)\  and 


I 


2  q 


Wij{a)  :=  4  1  J2  akrcT+i_k(a *) 

r=0  f  fc=0 


5Zfl*rT+i-fc(«*)  l 

fc= 0  J 


fori,j  =  1,  —  ,2  9-  1. 


(3.8) 


PROOF.  The  proof  utilizes  some  basic  results  of  Li,  Kedem,  and  Yakowitz  (1991),  con¬ 
cerning  the  asymptotic  normality  of  sample  autocovariances  of  the  filtered  data.  First  of 
all,  under  (Al),  it  is  easy  to  sec  that  (2.8)  yields 

<$a(a*)  =  -{(QrY7  YQ)-1QrYry  +  a}. 


Here,  as  well  as  in  the  following,  the  argument  a*  is  omitted  in  data  and  autocovariance 
matrices  for  the  sake  of  brevity.  Moreover,  by  following  the  same  lines  as  the  proof  of 
Theorem  3.1  of  Li,  Kedem,  and  Yakowitz  (1991),  we  can  write 


n 


"'QtYtYQ  =  R  +  oP{n-1'2)  and  n-xQTYTy  =  f  +  oP(n“1/2), 


where  R  :=  Q1  RQ,  r  :=  QT(r  +  rB )  =  2  Q7  r,  with 


and 


R  :  = 


*"5  > 
o 

•  r-2q+2 

ri 

, 

1'2q-2  ’ 

ro 

r 

to 

1 

\ _ 

n-j 


-1 


(j  =  0, 1, . . .  ,2<jf  —  1). 


t= 1 


Therefore,  it  follows  from  (3.9)  that 


^a(a*)  =  —  (R  ‘f  +  a)  +  op(n  1 /f2) 

=  -R-1(r  +  Ra)  +  oP(n“1/2). 


(3.9) 


(3.10) 


Let  tj  i for  brevity.  Then,  according  to  Theorem  3.1  and  Remark  3.3  of  Li, 
Kedem,  and  Yakowitz  (1991),  y/ii  { fj  —  rj } ,  (j  =  0, . . . ,  2q  —  1)  are  asymptotically  jointly 
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normal  with  mean  zero  and  covariance  matrix  Vr  :=  [vij],  where 

q  oo 

Vij  :=  2  cos(iu>k)  cos(ju>k)  rcT(a*)  cos(Tivk) 

k= 1  r=-oo 

+  (d  -  3)  r- (a*)  r](a*) 

OO 

+  e  <3-n> 

T  — —  OO 

for  i,  j  =  0,1,...,  2^  —  1 ,  and  /i  :=  £{e^(a*)}/i?{£^(a*)}  <  oo.  On  the  other  hand,  the 
vector  ?  +  Ra  in  (3.10)  can  be  regarded  as  the  value  of  some  function  f(Co?  •  •  • ,  (2^-1)  at 
( j  =  fj,  that  is,  r  +  Ra  =  f(r0, . . . ,  f2(?_i).  For  this  function,  it  is  also  true,  by  (2.1), 
that 

f(r0,...,r2,_i)  =  r„(a*)  +  Ry(o:*)a 

=  ry(ad)  +  Ry(a;*)  a(cU)  =  0. 

Therefore,  invoking  Proposition  6.4.3  of  Brockwell  and  Davis  (1987)  proves  that 

\/n  (r  +  Ra)  =  s/n  {f(?’0,  •  •  • ,  ^2,-1)  -  f(r0, . . . ,  r2,_i)} 

converges  in  distribution  to  a  normal  random  vector  with  mean  zero  and  covariance 
matrix  V/  :=  FVrFr,  where  F  is  the  Jacobian  matrix  of  f  evaluated  at  (r 0, . . .  ,r2?_i). 
It  is  easy  to  verify  that 

r  +  Ra  =  Q7  (r  +  rB  +  RQa)  =  QT[f  :  R  :  rB]  a, 

where  a  :=  [aq,  «i,  •  •  •  ,  a2(,]7  .  Simple  algebra  shows  F  =  [f0,  •  •  •  , f2(/_i],  where 

o,\  a\-j  T  «i  +j 

fo  =  Q7  :  and  f )  —  Q2  : 

a2q-l  0-2q-l-j  +  U2q-l+j 

for  j  —  1,. . .  ,2 q  —  1,  with  ak  :=  0  tor  k  <  0  and  k  >  2 q.  Upon  noting  that  vtJ  given  by 
(3.11)  are  symmetric  in  the  sense  that  =  uti_j  =  =  vi:i ,  we  can  rewrite  V/  as 

V  f  —  Q2  BQ,  where  B  :=  [&,•_,•],  (i,j  =  1,. . .  ,'2q  —  1),  with 

2g 

bij  . —  )  )  Vi—k,j—i  n/ . 
k,  1=0 
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As  can  be  seen  from  (3.11),  there  are  three  groups  in  the  expression  of  Vij.  The  first 
group  involves  the  sinusoidal  terms,  all  of  which  are  cancelled  out  in  the  expression  of 
bij,  because  J2ilo  ai~kl  =  0  for  k  =  1, . . . , q  and  hence 

2  g 

E^cosKO'  -  /))  =  0 
1=0 

for  j  =  l,...,2q  —  1  and  k  —  1  The  second  group  in  (3.11)  consists  of  v'tJ  := 

(fi  —  3)  r\ (a*)  It  is  easy  to  see  that  the  corresponding  term  in  V j  can  be  written 

as  ( fi  —  3)  UUT,  where 

U  :=  QT[rt(a*)  !  R£(a*)  iff  (a*)]  a 

=  r£(a*)  +  Rt(a*)  a  =  r£(ad)  +  R£(<x*)  a*. 

Under  (Al),  U  =  0  and  hence  that  term  in  V/  vanishes.  Combining  these  results,  we 
obtain  V;  =  Q7B0  Q  where  B0  :=  [U(J]  with 

co  2</ 

bij:=  E  £ 

T  =  -co  k,  1=0 

Furthermore,  it  is  not  difficult  to  verify  that  B0  can  be  written  compactly  as 

CO 

B0  =  E  ATrr(a*)r?(a*)(A  +  iA), 

T—  —  OG 

where  A  is  a  (4 q  —  l)-by-(2<j  —  1)  matrix  of  the  form 

a-2q  0 

a  o  a-2q 

0  a0 

and  rr(a*)  :=  [7''_2l?+1(a*),  ■  •  ■  ,  r^.+2g_1(Q;,‘)]r.  Since  IA  =  AI  and  IQ  =  Q,  we  obtain 
IAQ  =  AQ.  Therefore, 

oo 

v,  =  2  £  QTAV(a-)r^(a*)Aq. 


IS 


This  expression  can  be  further  simplified  as 

OO 

V/  =  4  £  QTA7  rT(«*)  if  (<**)  AQ  =  QTW(a*)  Q, 

r=0 

upon  noting  that  r_T(o:)  =  Irr(a:),  QrArro(o!*)  =  U  =  0,  and 

OO 

W(a*)  =  4  ATrT(o!*)  r^(a*)  A. 

T— 0 

Finally,  since  R  ^4'  RJ/(a*),  by  Slutsky’s  theorem  (Lehmann,  1982,  Lemma  4.1,  pp.  432- 
433)  we  obtain  from  (3.10)  that  \/n6a(ac *)  =  — R_1v/n(r  +  Ra)  +  op(l)  converges  in 
distribution  to  N(0,V).  0 

With  the  aid  of  this  lemma,  we  are  now  able  to  show  the  asymptotic  normality  of  the 
PF  estimator  A. 

Theorem  3.3  Suppose  that  the  conditions  in  Theorem  3.2  are  satisfied  and  that  E(£f)  < 
oo.  Then,  as  n  — ►  oo,  y/n(6c  —  cv*)  converges  in  distribution  to  a  normal  random  vector 
with  mean  zero  and  covariance  matrix 

V«  =  Rj1(a*)QTW(a*)QRj1(a*)  (3.12) 

where  W(a*)  is  defined  in  Lemma  3.3. 

PROOF.  It  follows  from  (3.6)  that  {I  —  C(A)}  (A  —  a*)  =  <5a(A).  By  the  mean-value 
theorem,  <5a(A)  can  be  written  as 

<5a(A)  —  <5a(a:*)  +  8a'(a*  +  A(&  —  a*))  e/A (a  —  a*) 

where  Sa'((x)  is  the  Jacobian  matrix  of  8k(a).  Since  8a'(a)  =  a'(a)  —  a'(a)  0 

uniformly  in  ol  £  A,  as  guaranteed  by  Lemma  3.1,  we  have 

[  8a'(ct*  +  A(A  —  a:*))  d\  *^4'  0. 

Jo 

In  addition,  the  consistency  of  A,  together  with  the  continuity  oi  C(c*),  implies  that 
C(A)  ^4  C(ci!* ).  Therefore,  by  Slutsky’s  theorem,  \fn  (A  —  a*)  has  the  same  asymptotic 
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distribution  as  ^Jn  {I  —  C(cr*)}“1^a(a:*).  Invoking  Lemma  3.3  shows  that  A/n(a  —  a*) 
converges  in  distribution  to  N(0,  Va),  where  Va  :=  {I  — C(a*)}_1  V  {I— C(a*)}_1.  Fur¬ 
thermore,  using  the  expression  (3.2)  of  C(a)  and  applying  the  matrix-inversion  formula, 
we  can  write 


{I  -  C(a)}"1  =  I  +  r-1(a)  =  r-\a)  {I  +  T(a)}. 

On  the  other  hand,  Ry(a)  =  Rx.(a)  +  R£(q:)  =  R£(a)  {I  +  r(o:)}.  Therefore,  we  obtain 
{I  — C(a)}_1R;/‘1(a)  =  r_1(a)  R~x(a)  =  R~1(a).  Substituting  this  result  in  the  above 
expression  of  Va  completes  the  proof.  <)■ 

4  Extension  to  Complex  Sinusoids 

A  parallel  theory  of  the  PF  method  can  be  easily  established  for  the  case  of  complex 
sinusoids  in  additive  noise.  In  fact,  if  the  signal  {aq}  is  a  sum  of  p  complex  sinusoids  as 
given  by 

xt  :=  /3k 

k= 1 

with  0  <  u)\  <  •  •  •  <  iOp  <  2r,  then  it  satisfies  a  pth-order  AR  autoregressive  equation  of 
the  form 

p 

aj  xt-j  —  o 
i=o 

where  the  AR  parameter  vector  a  :=  [aj,--  -  ,aP]T  is  defined  by  the  coefficients  of  the 
polynomial 

j2aJzP~3  =  n  (*-**) 

j=0  k= 1 

with  Zk  exp (iuk).  Given  a  finite  data  set  (yi, . . .  ,y„}  observed  from  (1.2),  one  ol  the 
widely-used  estimators  of  the  AR  parameter  a  is  given  by 

aLS  :=  —(YHY)~1YHy  (4.13) 
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where  Y  and  y  are  redefined  by 


Vp  ■■■  V 1  VP+ i 

Y  :=  :  ;  and  y  :=  j 

Vn—i  '  '  '  yn—p  Vn 

This  estimator  is  known  as  the  forward  linear  prediction  (FLP)  (Kay  and  Marple,  1981) 
which  minimizes  the  criterion  ||y  +  Yaj|2.  Other  procedures  such  as  the  forward-backward 
linear  prediction  (FBLP)  (Kay  and  Marple,  1981)  are  also  applicable  for  estimating  the 
AR  parameter  a  with  only  a  slight  modification  of  Y  and  y  in  (4.13). 

Introducing  a  parametric  fdter  {/ij(a:)}  indexed  by  the  parameter  a  :=  [on,  ■  •  •  ,  ap]T , 
an  estimator  a(cr)  of  the  AR  parameter  a  can  be  obtained  according  to  (4.13),  with  the 
data  matrices  replaced  by  those  of  the  filtered  data  {yf(a)}  in  (2.7).  The  assumption 
(Al)  retains  its  form,  but  R£(q;)  and  r£(a)  are  now  of  the  structure 


(4.14) 


with  the  autocovariances  obtained  from  the  filtered  noise  {e((o;)}.  In  this  case,  (Al) 
is  readily  recognized  as  being  the  Yule- Walker  equations.  In  other  words,  for  complex 
sinusoids,  (Al)  can  be  interpreted  as  the  requirement  that  the  filter  be  parametrized  so 
that  the  parameter  a  satisfis  the  Yule- Walker  equation  for  the  filtered  noise.  With  this 
property  being  fulfilled,  the  PF  estimator  &  is  defined  as  the  fixed-point  of  the  random 
mapping  a(o;)  and  can  be  obtained  by  FPI  in  (2.6). 


5  The  AR  Filter 

Although  its  consistency  is  guaranteed  by  the  asymptotic  theory  as  developed  in  Sec¬ 
tion  3,  the  accuracy  of  the  PF  estimator  depends  on  the  choice  of  the  parametric  filter 
to  be  applied  to  the  data.  Intuitively,  a  “good”  filter  should  be  bandpass,  so  that  the 
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sinusoidal  signal  can  be  enhanced  and  the  noise  be  eliminated  as  much  as  possible.  A 
useful  example  of  such  a  filter  is  the  AR  filter  that  will  be  considered  in  detail  as  an 
illustration  of  the  PF  method. 

The  AR(2<?)  filter  (or  simply  the  AR  filter)  is  a  parametric  filter  defined  recursively 
by 

Vt(oc)  +  t»i(a)  7]  yt-i(at)  +  •  •  •  +  02q(a)  i]2q  yt-2q(ot)  =  yu  (5.1) 

where  0  <  r;  <  1  is  a  parameter  that  controls  the  bandwidth  of  the  filter,  and  the  Oj(a ) 
are  coefficients  of  the  filter,  depending  on  a  and  being  symmetric  in  the  sense  that 

02,-j(a)  :=  Oj(a)  (j  =  0, 

with  0o(a)  :=  1.  This  filter  was  employed  by  Dragosevic  and  Stankovic  (1989)  to  estimate 
multiple  frequencies  with  the  specific  choice  of  the  filter  coefficients  0j(at)  =  ctj.  A  similar 
filter  was  also  used  by  Kay  (1984)  for  estimating  complex  sinusoids. 

Let  us  assume  that  the  additive  noise  {et}  in  (1.3)  is  white  with  zero  mean  and  finite 
fourth  moment.  It  will  be  shown  that  in  this  situation  a  very  simple  relationship  between 
{f4(a:)}  and  a  can  be  established  according  to  (Al)  and,  by  appropriately  selecting  oc 
in  a  parameter  space,  theoretical  results  obtained  in  previous  sections  apply. 

5.1  Parametrization 

Since  the  noise  {et}  is  assumed  to  be  white,  it  is  easy  to  verify  that  the  autocovariance 
function  of  the  filtered  noise  satisfies  the  equation 

Y^Vk  0k{ac)i-cT-k{a)  =  0  (r  =  1,2, ... ),  (5-2) 

k— 0 

or,  in  matrix  form, 

f£(a:)  +  ?/2?  rf  (a)  =  -Rc(aO  Q  0(a).  (5.3) 
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In  the  expression, 


9(a)  — 

Oi(a) 

Q  := 

Qi  0 

0r 

0q(a) 

Q2  o 

with  Qi  diag(?/, . . . ,  rjq  *)  and  Q2  :=  ?/9QiI.  By  prior-multiplying  each  side  of  (5.3) 
with  2  Qr,  we  obtain 

(1  +  r,2?)  re(a)  =  -2  QrR c(a)  Q  0(a). 


Therefore,  (Al)  requires  that 


«(«)  =  |(1  +  ^){QiRe(a)Q)-1Rc(a)a. 


(5.4) 


On  the  other  hand,  simple  algebra  shows  that 


QT*M“)  Q  =  Qt-'  =  R<(“) T 

where  TI;  is  a  q-hy-q  diagonal  matrix  of  the  form 

T  ,  ...  A  i  +  v21'  i 

n  \7/  +  7/2,;_1  ’  ’  llq~1  +  7??  +  1  ’  2  7/9  y 

As  a  result,  (5.4)  is  simplified  to  a  trivial  linear  equation 


n— 1 
v  ’ 


9(a)  =  Tna. 


(5.5) 


With  the  coefficients  given  by  (5.5),  the  AR filter  in  (5.1)  satisfies  (Al),  so  that  a  sequence 
of  estimators  {am}  can  be  produced  according  to  FPI  in  (2.6). 

As  compared  to  (5.5),  the  parametrization  in  the  generalized  least  squares  (GLS) 
method  of  Dragosevic  and  Stankovic  (1989)  is  given  by 

0(a)  =  a.  (5.6) 


Notice  that  for  tj  <  1,  (5.5)  differs  from  (5.6).  It  is  this  difference  that  makes  the  PF 
estimator  consistent  for  any  ?/  <  1,  while  the  GLS  estimator  is  inconsistent.  Note  also 
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that  PF  and  GLS  coincide  when  ij  —  1.  The  iterative  filtering  algorithm  (IFA)  proposed 
by  Kay  (1984)  corresponds  to  the  complex  version  of  PFAR  with  ?/  =  1.  However,  Kay 
used  the  Burg  estimator  instead  of  the  least  squares  in  order  to  guarantee  the  stability 
of  the  filtering. 

In  the  special  case  of  q  —  1,  (5.5)  reduces  to 

0(°o=F^a- 

I'l 

By  this  choice,  (5.1)  becomes  a  variant  of  the  AR(2)  fdter  discussed  by  Li  and  Kedem 
(1991)  and  Li,  Kedem,  and  Yakowitz  (1991).  Furthermore,  if  rj  =  1,  it  coincides  with  the 
procedure  proposed  by  Quinn  and  Fernandes  (1991)  and  Truong- Van  (1990). 

5.2  Parameter  Space 

In  order  for  the  AR  filter  to  possess  the  properties  (A3)  and  (A4),  the  parameter  a 
cannot  be  arbitrarily.  Instead,  it,  must  lie  inside  a  parameter  space.  A  possible  choice  of 
such  a  parameter  space  is  presented  as  follows. 

Let  0  be  the  collection  of  6  :=  -  ,0q]T  for  which  all  zeros  of  the  polynomial 

z2q~k  are  distinct  and  on  the  unit  circle  in  the  complex  domain,  that  is, 

i20kz2q-k=  f[(z  -  e^)(z  -  e-^) 
k= 0  k=  1 

for  some  0  <  Ai  <  •  •  •  <  \q  <  n,  where  0o  :=  1  and  02 q-k  ’■=  Of. ,  (k  =  0, . . . ,  q  —  1).  Define 
the  parameter  space  A(tj)  to  be  the  collection  of  a  such  that  0(a)  =  T,,  a  €  0,  that  is, 

A(q)  (a  :  T,ae  0}. 

Since  Aj  take  on  values  in  an  open  region,  both  0  and  A(r/)  are  also  open  in  the  q- 
dimensional  space.  Moreover,  it  is  readily  shown  that  the  poles  of  the  AR  filter  with 
6(a)  given  by  (5.5)  occur  on  the  circle  \z\  =  p  if  a  £  A(r]).  Therefore,  with  rj  <  1  and 
a*  6  A(rj),  the  AR  filter  satisfies  (A3)  and  (A4)  for  some  bounded  and  closed  subset 
A  C  A('tj)  that  contains  a*  in  its  interior. 
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In  practice,  it  is  not  guaranteed  that  arn  will  always  appear  inside  A(rj).  In  case  of 
falling  outside,  am  has  to  be  projected  back  into  A(p)  in  order  to  satisfy  the  conditions  of 
the  PF  method.  Suppose  that  for  a  given  a,  the  zeros  of  the  polynomial  YfkLo  0 k(a )  z2q~k 
are  of  the  form  pk  exp(±iAfc),  (k  =  1,...,</),  for  some  pk  >  0  and  0  <  Aj  <  •  •  •  < 
Ag  <  7r,  where  {^.(o;)}  are  defined  by  (5.5).  Then,  the  projection  of  0(a)  on  0  is 
obviously  0(a)  :=  [^(cc),  •  •  •  ,  0q(a)]T ,  where  0k(a)  are  determined  by  the  coefficients  of 
the  polynomial 

E  0k(a)  z2q~k  =  f[(z-  eiXk)(z  -  e~iXk). 
k=0  k- 1 

By  this  operation,  we  simply  force  the  poles  of  the  AR  filter  to  occur  on  the  circle  \z\  =  rj. 
As  a  result,  the  projection  of  a  on  A(p),  denoted  by  a ,  can  be  written  as 

a  =  T;1  0(a)  e  A(V). 

5.3  Statistical  Properties 

For  the  properties  in  Section  3  to  hold,  (A2)  is  the  only  thing  left  for  verification.  It  is 
readily  shown  that  the  AR  filter  satisfies  this  condition,  since  its  transfer  function 

H{u-,a)=  I  £ 

(fc=0 

is  nonzero  for  all  a  £  A(r/). 

In  conclusion,  the  AR  filter,  defined  by  (5.1)  and  (5.5)  with  ?/  <  1,  satisfies  (Al)- 
(A4).  Consequently,  the  corresponding  PF  estimator  cv,  called  the  PFAR  estimator, 
possesses  the  statistical  properties  in  Section  3,  regarding  the  existence,  convergence, 
strong  consistency,  and  asymptotic  normality,  as  can  be  summarized  as  follows. 

Theorem  5.1  Suppose  that  {c<}  is  white  with  finite  fourth  moment.  Then,  for  the  AR 
filter  defined  by  (5.1)  and  (5.5)  with  p  <  1,  the  results  in  Theorem  3.1,  Theorem  3.2,  and 
Theorem  3.3  are  valid,  provided  that  a*  G  Aft)). 
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As  aforementioned,  the  parameter  ?/  plays  the  role  of  controlling  the  bandwidth  of 
the  AR  fdter.  Indeed,  the  closer  r]  is  to  1,  the  narrower  is  the  bandwidth  of  the  AR  filter. 
Since  the  sinusoidal  signal  under  investigation  concentrates  only  on  extremely  narrow 
bands  (spikes)  in  the  frequency  domain,  it  is  clear  that  in  order  to  enhance  the  signal  by 
the  AR  filter,  r/  should  be  chosen  as  close  to  1  as  possible.  From  another  point  of  view,  the 
parameter  7)  also  determines  the  asymptotic  behavior  of  the  associated  PFAR  estimator 
in  terms  of  its  covariance  matrix.  As  a  matter  of  fact,  the  asymptotic  covariance  matrix 
V„  of  the  PFAR  estimator  can  be  made  arbitrarily  small  if  7]  is  chosen  arbitrarily  close 
to  1.  More  precisely,  it  can  be  shown  that  V,*  tends  to  a  zero  matrix  at  the  rate  of  order 
(1  —  ?/)3  as  rj  — >  1. 

To  verify  this  assertion,  we  first  need  to  rewrite  toaj( a*)  defined  by  (3.8)  into  a  more 
suitable  form,  with  the  help  of  the  following  spectral  representation  of  the  autocovariance 
function  r^aP): 

To  do  so,  we  notice  that  (5.2)  can  be  rewritten  as 

2  q 

YJdkakrtT_k(oc*)  =  0  (t  =  1,2,...), 

k=0 

with  dk  defined  by 

ch  :=  Vk0k(c*)/ak  =  (rjk  +  v2q+k)/(i}k  +  V2q~k)- 

It  turns  out,  by  using  the  spectral  representation  of  F)(oP),  that 

SV  :=  J2akrr-kia*)  =  -dfc)afcr£r_fc(a*) 

k=0  k= 0 

O 


Introducing  the  following  polynomials 

D(z)  :=  £(1  -  dk)  ak  P{z)  :=  £  Ok{cf)  rf  z\ 

k= 0  k= 0 


f7 

2  7T 


[K  \m^cx*)\ 

J  —IT 


2  q 


X](l  -  dk)akc 


—ikuj 


'  du>. 


k= 0 
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and  Q(z)  :=  z2qP(z  *),  we  can  write  ST  as  a  Cauchy  integral  of  the  form 


?  =  f 
27 n  t,=1 


M-  z'-'  dz 


27 n  J\z\=\  P(z)  Q{z)  * 

for  any  r  >  1.  Note  that  a*  €  A(ij)  implies  0(a*)  €  0.  This,  in  turn,  guarantees 
that  all  of  the  2 q  zeros  of  Q(z )  appear  on  the  circle  \z\  =  t]  and  can  be  expressed  as 
Vj  :=  V  exp(iXj),  with  Aj  satisfying  0  <  Ax  <  •  •  •  <  A,  <  n  and  X2q-j+i  :=  -Aj  for 
j  =  1, . . . ,  q.  As  a  result,  we  can  write 

2q  2q 

Q(z)  -  Yl(z  ~  i^j)  and  ■P(z)  =  1[[{\-Vjz), 

7=1  j=l 

where  Pj  :=  rj  exp(-iXj).  It  follows  from  the  residue  theorem  of  complex  analysis  that 


9  _  jy'  Dji'k)  T_1  (r  =  i  9  ) 

,T  'EpM Q'M  *  ,2’"'  ' 


Using  this  formula,  Wij(cx*)  in  (3.8)  can  be  written  as 


c  A  o;d{h)d(v,)  4'4  '  ,,7) 

where  the  overline  denotes  the  complex  conjugation. 

Now,  with  this  expression,  the  behavior  of  W(c**)  as  rj  tends  to  1  become  easer  to 
investigate.  In  fact,  we  first  notice  that  vk  — >  zk  as  i)  — >  1.  Moreover,  it  is  easy  to  see 
from  the  definition  of  dj  that  d0  =  1  and 

(1  -rfy1  (1  -dj)  ->  j/2  (;  =  l,...,2g) 

as  i]  —>  1.  It  follows  that 


(1  -  r/2)  1  D{uk)  -+  5Z  \iuJ  zl 


Inr,  z2<l-j  —  _  I  ^-1  /!' 


i**M'(**), 


where  A(z)  is  the  polynomial  given  by  (1.6).  In  addition,  it  is  readily  shown  that 
(1  -  rj2)-1  P(vk)  =  ]/[(!  -  VjVk)  -►  :=  XI(1  “  ziZk) 
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and 


Q'(vk) = n(,y*  -  yi)  n(^ -  zi) = A>(zk)- 

j= i  j=i 

Finally,  it  is  easy  to  see  that  (1  —  77 2 ) / ( 1  —  7477)  =  1  for  /  =  A:,  and  (1  —  ?/2)/(  1  —  7477)  —*  0 
as  r]  — >  1  for  /  /  k.  Substituting  all  these  limits  in  (5.7)  yields 

(1  -  7 /2)  Wfi(0£*)  -»  <X4  X]  jpTjl  -TM"1 
as  77  — >  1.  Upon  noting  that  | J)- |  =  |A'(.24)|,  we  can  also  write 

(1  -  t?2)  QtW (a*)  Q  -4  4<r4  S0  with  S0  :=  QTSD0SHQ,  (5.9) 


where  D0  is  a  2r/-by-2<7  diagonal  matrix  of  the  form 

D°:=4dias{|4^jp’'''’PmjF} 

and  S  the  Vandermonde  matrix  in  (2.11). 

To  complete  the  analysis,  we  only  need  to  know  the  behavior  of  Rx(c**)  as  rj  tends 
to  1.  For  this  purpose,  we  notice  that  \H(u>k‘, cr*)|2  =  1/|Q(^)|2  and  Q{zk)  =  —  D(zk). 
It  follows  from  (5.8)  that 

(l-r/2)2|//(u;,;a*)l2->4/|/l'(^)|2. 


According  to  (2.10),  this  implies  that 


(1  -7/2)2Rx.(cV)  ->  Sr££  with  £  :=  QTSDS/7Q. 


(5.10) 


In  this  expression, 

D:=4d,ag{il^’'''’ 

°k  ’=  fill  £j=i  P],  and  alq-k+ 1  :=  ah  (k  =  F- ••,</)•  Introducing  the  notation  7  := 
ro/ro  =  rol<7l  as  being  the  signal-to-noise  ratio  of  the  data,  and  substituting  all  these 
limits  in  the  expression  of  Va,  given  by  (3.12),  we  finally  obtain 


lim  (f±A)  V. 

r;-^l  \  1  —  7 f  J 


(5.11) 
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In  particular,  when  the  q  sinusoidal  compoments  have  the  same  power,  that  is,  0k  =  0i, 
(5.11)  reduces  to 

*3  (£5)^*  =  £=.-*•  (5.12) 

since  in  this  case  D  =  q-1  D0. 

REMARK.  It  is  interesting  to  recognize,  by  comparing  (5.9)  with  (2.10),  that  X0  is 
identical  to  the  autocovariance  matrix  of  a  sinusoidal  signal  having  the  same  q  frequencies 
as  {.tJ,  with  the  amplitude  corresponding  to  u>k  being  equal  to  1/|/-1'(^)|.  Similarly,  X 
in  (5.10)  has  the  same  structure  as  So,  but  the  amplitude  associated  with  Uk  is  replaced 
by  &k/\A'(zk)\.  Therefore,  these  matrices  can  be  rewritten  as 


y  [y-  cos  {Mi- j)} 

°  [t,  2\*{zk)\* 

where  i,  j  =  1, . . . ,  2q  —  1. 


and  S  — 


'  *  cos{o;r.(f  -  j)} 

hx  2\A'(zk)\2  . 


5.4  Accuracy  of  the  PFAR  Estimator 

As  guaranteed  by  Theorem  3.3,  the  PF  method  produces  an  estimator  a  so  that  y/n(a  — 
a*)  is  asymptotically  normally  distributed  and  thus  its  estimation  accuracy  is  of  order 
0(n-1/2).  Upon  using  the  AR  filter,  it  has  been  shown  that  the  asymptotic  covariance 
matrix  of  the  resulting  PF  estimator  (i.e.,  the  PFAR  estimator)  can  be  made  arbitrarily 
small  as  ?/  tends  to  1.  This  indicates  that  a  higher  order  accuracy  could  be  obtained 
with  q  ps  1.  Indeed,  as  discussed  in  Quinn  and  Fernandes  (1991),  Truong- Van  (L990), 
and  Li,  Kedem,  Yakowitz  (1991)  for  the  case  of  a  single  sinusoid,  the  PFAR  estimator  is 
able  to  achieve  the  same  accuracy  of  order  0(n~3/2)  as  the  nonlinear  least  squares  (NLS) 
approach  (Hannan,  1973;  Stoica  and  Nehorai,  1989;  Walker,  1971)  in  the  limiting  case  of 
ri  =  1. 

An  advantage  of  the  PF  method  over  the  NLS  lies  in  its  computational  simplicity 
inherited  from  the  explicit  LS  solution  of  a(a).  Another  advantage  of  the  PF  method  is 
its  less  stringent  requirement  of  the  initial  estimates.  As  pointed  out  by  many  researchers 

(Rice  and  Rosenblatt,  1988;  Stoica,  et  al.,  1989;  Li,  Kedem,  and  Yakowitz,  1991),  the 
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NLS  approach,  as  well  as  the  PFAR  estimator  in  the  limiting  case  of  rj  =  1,  requires  an 
initial  estimate  of  accuracy  o(n-1),  in  order  to  obtain  the  optimal  solution  by  iterative 
procedures.  On  the  other  hand,  an  initial  estimate  of  accuracy  0(1)  is  sufficient  for 
the  calculation  of  the  PFAR  estimator  with  ??  <  1  using  FPI  in  (2.6),  as  indicated  by 
Theorem  3.1  and  also  verified  by  simulations  of  Li,  Kedem,  and  Yakowitz  (1991).  More 
importantly,  thanks  to  the  flexibility  for  the  choice  of  7/,  the  estimation  accuracy  of 
the  PFAR  estimator  can  be  improved  significantly  upon  using  a  monotone  increasing 
sequence  of  i /  such  that  0  <  r/i  <  i]2  <•••—>  1. 

According  to  the  suggestions  of  Li,  Kedem,  and  Yakowitz  (1991),  the  iteration  in  (2.6) 
can  be  carried  out  with  ?/  =  7/7.  until  convergence,  using  the  PFAR.  estimator  previously 
obtained  with  ?/  =  rjk-i  as  the  initial  value.  In  so  doing  for  k  =  1,2,...,  a  sequence 
of  PFAR  estimators  associated  with  {77^}  is  produced,  each  estimator  in  the  sequence 
serving  as  the  initial  value  of  its  successor.  As  k  grows,  the  accuracy  improves  without 
requiring  a  stringent  initial  estimate.  Another  possible  way  of  improving  the  estimation 
accuracy  is  to  increase  ?/  after  each  iteration  rather  than  carrying  on  the  iteration  until 
convergence  (see,  c.g.,  Dragosevic  and  Stankovic,  1989;  Kedem  and  Yakowitz,  1991). 
This  strategy  simplifies  the  computation  but  may  result  in  converging  to  a  false  location 
if  7/  is  increased  too  fast. 

In  the  most  interesting  case  of  closely-spaced  frequencies,  however,  one  should  be 
very  cautious  when  increasing  ?/.  Simulations  show  (see  next  section)  that  the  bias  of  the 
PF  estimator  from  finite  data  increases  as  7/  approaches  1,  while  the  variance  decreases. 
For  closely-spaced  frequencies,  the  bias  eventually  dominates  the  variance,  and  hence  an 
appropriate  7/  <  1  should  be  used  to  balance  the  trade-off  between  the  bias  and  variance 
in  minimizing  the  mean-squared  error. 

For  the  case  of  multiple  sinusoids,  the  fixed-point  iteration  in  (2.6)  with  77  «  1  can  be 
shown  to  be  an  algorithm  that  approximately  calculates  the  NLS  estimator  in  an  iterative 
fashion.  In  fact,  since  the  NLS  estimator  minimizes  the  sum  of  squared  errors 

n  f  g  a  'j  2 

j  :=  J2  1  yt  -  ^(AkCosukt  +  Bksmukt)  \  (5.13) 

7=1  l  k=l  J 
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with  respect  to  Ak,  Bk,  and  a>/.,  it  can  be  implemented  in  two  steps,  similar  to  the  proce¬ 
dure  of  Bresler  and  Macovski  (1986)  for  complex  sinusoids.  The  first  step  is  to  minimize 
J  with  respect  to  [Ai,  •  •  •  ,  Aq ,  B\ ,  •  •  •  ,  Bq]J ,  while  keeping  u>k  fixed.  It  is  not  difficult  to 
verify  that  the  optimal  vector  is  given  by  ( GTG)~1GTy ,  where  y  :=  \y i,  •  •  •  ,yn]T  and 


G  :  = 


COS  Cji 


cos  ujq  sin  Cj\ 


^  cos  mb i  •  •  •  cos  nCbq  sin  nb\ 
Substituting  this  optimal  vector  in  (5.13)  gives 


sind>g 


sin  nujq 


J  =  || {I  -  G  (GTG)-1GT}y||2  =  yT{I  -  G  (GTG)~1GT}y. 


This  quantity,  in  turn,  is  minimized  with  respect  to  u>k  in  the  second  step.  Notice  that 
I  —  G  (G1  G)_1Gr  is  a  projection  operator  that  projects  an  n-vector  onto  the  orthogonal 
complement  of  the  2^-dimensional  column-space  of  G.  On  the  other  hand,  if  we  let 
hj  be  the  AR  parameters  determined  by  (1.6)  for  any  given  u>k,  and  denote  by  A  the 
corresponding  n-by-(n  —  2 q)  matrix  of  the  structure  (5.18),  it  is  easy  to  verify  that 
ArG  =  0.  This  implies  that  the  n  —  2q  linearly-independent  columns  of  A  are  orthogonal 
to  the  columns  of  G  and  thus  span  the  (n  —  2<f)-dimensional  orthogonal  complement  of 
the  column-space  of  G.  As  a  consequence,  we  obtain 

I  -  G  (GtG)-xGt  =  A  (ArA)-1  At 

and  hence  J  =  yTA  (A1  A)-1  A7  y.  Furthermore,  let  e  :=  [e2(?+ 1,  ■  •  •  ,  cn]7  ,  where 

2  q 

:=  5D  »*--»*  (5-14) 

3= 0 

Simple  algebra  shows  that  AT y  =  e.  Therefore,  J  can  be  rewritten  as 

J  =  e7'(ATA)-1e.  (5.15) 

The  NLS  is  thus  reduced  to  the  problem  of  minimizing  J  in  (5.15)  with  respect  to  aj.  To 
compute  the  NLS  estimator,  an  iterative  procedure  can  be  employed  in  accordance  with 
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the  suggestions  of  Bresler  and  Macovski  (1986).  Indeed,  for  any  estimate  a m,  a  matrix 
Am  can  be  constructed  accordingly  and  a  new  estimate  can  be  taken  as  the  minimizer 
of  the  criterion  Jm  :=  eT(A^Am)_1e  with  respect  to  a.j. 

Suppose  that  am  :=  [a^,  •  •  •  ,  appears  inside  a  tiny  neighborhood  of  the  AR 

parameter  a  that  corresponds  to  the  AR  model  of  the  signal.  It  is  easy  to  see  that  within 
such  a  neighborhood  et  can  be  approximated  by 

2q  2  q  2  q 

V  ^  ~(m)  ,  \~>  *(m)  \  '  «(m) 

e<  =  2^  aj  xt~3  +  L  a)  tt-i  et~r 

j— o  j—0  j= 0 

Since  {et}  is  white,  it  can  be  shown  from  this  approximation  that  the  covariance  matrix  of 
e  is  approximately  equal  to  cr^AjnAm.  Therefore,  em  :=  (A^Am)"R2e  can  be  regarded  as 
a  whitening  procedure  that  decorrelates  the  vector  e.  Resorting  to  the  frequency-domain 
interpretation,  this  whitening  procedure  can  be  approximately  performed  by  applying  on 
et  an  AR  filter  of  the  form  (5.1)  with  i]  m  1  and  am.  Let  {ejm))  be  the  output  of  the 
filter,  then  em  1  and  hence  Jm  =  ||em||2  I^(e[m*)2.  On  the  other 

hand,  by  interchanging  the  order  of  the  AR  filtering  on  et  and  the  FIR  filtering  on  yt 
defined  by  (5.14),  we  obtain 

2<j 

etm)  ~  E«i^-i(am)- 

j=0 

Thus,  minimizing  Jm  is  approximately  equivalent  to  minimizing  ■, 

which  yields  the  PFAR  estimator  aTO+1  produced  by  FPI  in  (2.6). 

The  above  discussion  indicates  that  for  multiple  sinusoids  the  PFAR  estimator  with 
i]  tending  1  is  also  capable  of  approaching  the  accuracy  of  the  NLS  procedure,  just  like 
the  case  of  a  single  sinusoid  as  discussed  by  Li,  Kedem,  and  Yakowitz  (1991). 

5.5  The  Case  of  Two  Sinusoids 

The  simplest  case  of  the  PFAR  estimator  for  a  single  sinusoid  is  similar  to  the  CM 
estimator  discussed  by  Li  and  Kedem  (1991)  and  Li,  Kedem,  and  Yakowitz  (1991).  From 
their  results,  it  is  easy  to  verify  that  for  q  —  1  the  asymptotic  variance  V0  of  the  PFAR 
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estimator  a  can  be  expressed  exactly  by  (5.12)  without  taking  the  limit,  that  is, 

4sin2o;i 


V, 


=  ( Lz£) 
vi +W 


7 


The  requirement  that  a*  6  *4(7/)  reduces  to  |  cosa>i|  <  +  7/2).  This  implies  that  U\ 

should  stay  away  from  0  and  7 r  at  least  by  an  amount  of  arccos{2?//(l  +  7/2)}. 

Let  us  now  consider  specifically  the  case  of  two  sinusoids  ( q  =  2)  in  additive  white 
noise.  In  this  case,  0  consists  of  all  6  =  [tfi ,  02] T  with 


01 


-2(cos  Ai  +  cos  A2)  and  O2  —  2(1  +  2  cos  Ai  cos  A2) 


(5.16) 


for  some  0  <  Ai  <  A2  <  7r.  Simple  algebra  shows  that  cos  Aa  and  cosA2  must  be  of  the 
form  {— 0\  ±  ( 6 2  —  40?  +  8)1//2}/4.  Therefore,  0\  and  02  should  satisfy 

A  ±  yj0\  -  4 02  +  8  <  1  and  01  -  402  +  8  >  0. 


It  turns  out  by  solving  these  inequalities  that  0  is  the  region  defined  by 


0:  2\0i\-2<02<\01  +  2. 


Moreover,  for  q  —  2,  (5.5)  reduces  to 

n  t  A  1  +  V4  j  n  /  \  1  +  V4 

=  and  02(a)  =  -^a2. 

According  to  (5.17),  the  parameter  space  A(y)  is  given  by 


(5.17) 


A{v)  : 


4  V 


l«i| 


4r/2  1  +  if  o  4  iiz 

<  «2  <  7TT-,  I  ~ov>  <A  + 


(5.18) 


1  +72  1  +  r/4  2(1  +?/2)2  1  1  +  V4 

Figure  1  shows  A(rj)  for  i]  =  0.8  together  with  0  defined  by  (5.17). 

It  is  readily  seen  from  Figure  1,  as  well  as  (5.18)  and  (5.5),  that  A(rj)  is  contained  in 
0  for  77  <  1 ,  and  that  A(r])  coincides  with  0  as  r)  — >  1.  Therefore,  for  any  given  uq  and 
lo2  satisfying  0  <  u>\  <  o;2  <  7T,  the  requirement  ol*  €  A(i])  in  Theorem  5.1  can  always 
be  met  by  choosing  7/  close  enough  to  1. 

On  the  other  hand,  for  a  given  7/  <  1,  the  requirement  a*  €  *4(7/)  imposes  a  separation 
condition  on  the  frequencies  of  the  signal.  As  a  matter  of  fact,  in  order  that  a*  €  A{r] ), 
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Figure  1:  Parameter  space  A(y)  with  y  =  0.8  for  the  case  of  two  sinusoids.  The  dotted 
lines  define  the  region  0. 

the  frequencies  uq  and  should  stay  away  from  0  and  n,  respectively,  and  from  each 
other,  by  a  certain  amount  depending  on  y.  It  is  not  too  difficult  to  verify  that  a  sufficient 
condition  for  (5.18)  to  be  fulfilled  is  that 

/  [+v2  I 

co>i ,  7T  —  u>2  >  arccos  <  ——======  >  and  u>2  —  uq  >  arccos 

l  \/2(l  +  t)  I 

To  get  a  complete  picture  of  the  separation  condition,  Figure  2  shows  for  y  —  0.8  and 
0.9  the  set  referred  to  as  the  frequency  space,  of  all  frequencies  for  which  a*  £  A(y). 

Let  us  now  consider  the  characteristic  of  the  AR  filter  at  a  =  «*,  which  determines 
the  asymptotic  behavior  of  the  PFAR  estimator,  as  indicated  by  Theorem  3.3.  Let  us 
first  look  at  the  squared  gain  function  \H(u;ac*)\2.  In  Figure  3,  a*)\2  is  plotted 

for  y  =  0.92,  where  oc*  =  a  is  the  AR  parameter  corresponding  to  the  sinusoidal  signal 
with  tui  =  0.327T  and  —  0.457T.  It  is  interesting  to  observe  that  the  squared  gain 
function  has  peaks  near,  but  not  exactly  at,  the  frequencies  of  the  signal.  Therefore,  the 
PFAR  method  can  eventually  enhance  the  signal,  while  making  the  estimator  consistent 
by  slightly  biased  peaks.  A  further  illustration  of  this  point  is  presented  in  Figure  4, 
where  the  poles  of  the  AR  filter  are  plotted  against  the  frequencies  of  the  signal.  It  can 
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Figure  2:  Frequency  space  of  normalized  frequency  /  :=  uj/tt  for  r)  —  0.8  (smallest)  and 
0.9.  The  region  defined  by  dotted  lines  corresponds  to  the  extreme  case  of  rj  =  1. 

be  seen  that  the  PFAR  method  does  not  in  general  force  the  poles  of  the  AR  filter  to 
have  the  same  angular  frequencies  as  those  of  the  signal.  Instead,  a  slightly  biased  poles 
are  used  on  the  basis  of  (5.5)  in  order  to  produce  a  consistent  estimator,  as  guaranteed 
by  Theorem  5.1. 

6  Experimental  Results 

In  order  to  illustrate  the  performance  of  the  PF  method,  we  employ  the  all-pole  filter 
(5.1)  and  consider  the  case  of  two  sinusoids  ( q  =  2)  in  Gaussian  white  noise.  All  of 
the  simulations  in  this  section  are  based  on  100  independent  realizations  of  {yt}  with  a 
relatively  short  length  of  n  =  100.  The  phases  of  the  sinusoids  arc  fixed  at  zero,  and 
the  sample  variance  of  the  noise  is  adjusted  in  each  realization  according  to  the  sample 
variance  of  the  signal  in  order  to  achieve  the  required  signal-to-noise  ratio.  Furthermore, 
in  both  PF  and  GLS  —  which  correspond  to  the  parametrizations  (5.5)  and  (5.6)  respec¬ 
tively  —  the  poles  of  the  all-pole  filter  are  constrained  to  be  on  the  circle  \z\  =  ?/,  by 
projection  if  necessary,  so  that  the  bandwidth  parameter  i]  effectively  controls  the  band- 
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Figure  3:  Plot  of  squared  gain  with  r;  =  0.92  and  a  =  a*  for  the  case  of  two  sinusoids 
with  lo\  =  0.327T  and  u>2  =  0.457T. 


Figure  4:  Location  of  poles  of  the  AR  filter  shown  in  Figure  3.  The  dotted  lines  indicate 
frequencies  of  the  signal  on  the  unit  circle. 


Table  1:  PF  fe  GLS  Estimates  for  Well- Separated  Frequencies 


1 

PF 

GLS 

mse 

bias 

var 

complexity 

mse 

bias 

var 

complexity 

Prony 

4.72e-3 

4.40e-3 

3.23e-4 

— 

4.72e-3 

4.40e-3 

3.23e-4 

— 

0.950 

2.20e-6 

1.17e-8 

2.19e-6 

00 

o 

H- 

0.6 

2.34e-6 

1.48e-7 

2.19e-6 

8.1  ±0.8 

0.960 

1.96e-6 

9.14e-9 

1.96e-6 

3.1  ± 

0.2 

2.04e-6 

9.47e-8 

1.95e-6 

3.2  ±0.4 

0.970 

1.79e-6 

9.72e-9 

1.78e-6 

3.2  ± 

0.3 

1.84e-6 

6.21e-8 

1.77e-6 

3.3  ±0.4 

0.980 

1.67e-6 

1.64e-8 

1.65e-6 

3.3  ± 

0.4 

1.69e-6 

4.96e-8 

1.65e-6 

3.4  ±0.4 

0.985 

1.63e-6 

2.39e-8 

1.60e-6 

2.9  ± 

0.4 

1.65e-6 

4.79e-8 

1.60e-6 

2.8  ±0.4 

0.990 

1.59e-6 

3.60e-8 

1.55e-6 

3.0  ± 

0.5 

1.60e-6 

5.08e-8 

1.55e-6 

2.7  ±0.4 

0.995 

1.55e-6 

5.53e-8 

1.49c-6 

3.4  ± 

0.4 

1.55e-6 

6.08e-8 

1.49e-6 

2.9  ±0.6 

1.000 

1.50e-6 

8.27e-8 

1.42e-6 

3.7  ± 

0.6 

1.51e-6 

8.29e-8 

1.42e-6 

3.4  ±0.5 

width  of  the  all-pole  filter  and  the  performance  of  the  estimators.  For  convenience,  the 
following  simulation  results  are  given  in  regard  to  the  normalized  frequencies  fk  :=  oJk/n, 
and  the  average  mean-squared  error 

mse  ~\{E{h  ~  fx)2  +  E(f2  -  f2)2} 

is  employed  as  an  overall  performance  index.  Moreover,  we  define  the  average  bias  and 
average  variance  of  the  frequency  estimates  by 

bias  :=  \{{E(j\)  -  ,/i)2  +  (E(f2)  -  /2)2}  and  var  :=  |{var(/i)  +  var(/2)} 

respectively.  The  frequency  estimates  of  both  PF  and  GLS  are  obtained  by  the  fixed- 
point  iteration  (2.6)  in  connection  with  (5.16)  which  provides  the  relationship  between 
the  frequency  and  AR  estimates.  The  stopping  rule  of  the  iteration  is  given  by 

In  other  words,  the  iteration  terminates  at  the  mth  iteration  il  this  inequality  is  satisfied. 
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Table  2:  PF  fc  GLS  Estimates  for  Closely-Spaced  Frequencies 


V 

PF 

GLS 

mse 

bias 

var 

complexity 

mse 

bias 

var 

complexity 

Prony 

1.46e-2 

1.40e-2 

5.68e-4 

— 

1.46e-2 

1.40e-2 

5.68e-4 

— 

0.950 

1.81e-6 

2.69e-8 

1.78e-6 

10.6  ±3.4 

4.68e-6 

3.05e-6 

1.63e-6 

11.0  ± 

5.3 

0.960 

1.67e-6 

6.20e-8 

1.61e-6 

3.3  ±0.4 

3.63e-6 

2.17e-6 

1.46e-6 

3.8  ± 

0.2 

0.970 

1.63e-6 

1.63e-7 

1.46e-6 

3.8  ±0.3 

2.95e-6 

1.59e-6 

1.36e-6 

3.9  ± 

0.2 

0.980 

1.75e-6 

4.21e-7 

1.32e-6 

4.3  ±0.3 

2.56e-6 

1.29e-6 

1.27e-6 

3.7  ± 

0.4 

0.985 

1.90e-6 

6.57e-7 

1.25e-6 

4.2  ±0.4 

2.46e-6 

1.25e-6 

1.21e-6 

c© 

0.4 

0.990 

2.15e-6 

9.83e-7 

1.16e-6 

4.4  ±0.3 

2.45e-6 

1.31e-6 

1.15e-6 

3.0  ± 

0.5 

0.995 

2.47e-6 

1.41e-6 

1.06e-6 

4.5  ±0.5 

2.56e-6 

1.50e-6 

1.06e-6 

3.7  ± 

0.7 

1.000 

2.85e-6 

1.91c-6 

9.45e-7 

4.5  ±0.8 

2.86e-6 

1.91e-6 

9.46e-7 

4.3  ± 

0.8 

In  our  simulations,  we  first  compare  the  performance  of  PF  and  GLS  in  the  cases  of 
well-separated  and  closely-spaced  frequencies.  In  both  cases  the  SNR  is  fixed  at  0  dB 
per  sinusoid,  while  the  bandwidth  parameter  ?;  in  the  all-pole  filter  (5.1)  takes  different 
values.  Since  r]  varies,  it  is  convenient  to  explicitly  write  the  corresponding  frequency 
estimates  (/i(?/),  hiv))  as  functions  of  ?/.  Table  1  and  Table  2  present  some  statistics 
of  the  frequency  estimates  for  eight  ascending  values  of  77,  that  is,  771  =  0.95,  772  = 
0.96, ...,  778  =  1.  The  mean  and  variance  of  the  stopping  time  m  are  also  given  as 
“complexity”  in  the  form  of  “mean  ±  variance”. 

Notice  that  in  the  fixed-point  iteration  (2.6)  two  things  are  needed:  a  value  of  the 
bandwidth  parameter  77  and  an  initial  guess  of  the  AR  parameter  a.  In  both  PF  and 
GLS,  we  use  Prony’s  estimator  aLs  as  the  initial  guess  of  a  corresponding  the  first  value 
of  77,  i.e.,  77  =  771  =  0.95.  When  the  iteration  terminates,  the  resulting  AR  estimate, 
denoted  by  a^),  is  used  not  only  to  obtain  the  frequency  estimates  ( _/i (771 ) ,  _/2 (^?i ) ) ,  but 
also  to  initiate  the  iteration  for  the  next  value  of  77,  namely,  77  =  772  =  0.96.  In  general, 
as  77  grows  from  rj2  to  773,  we  employ  the  previous  AR  estimate  a^—i)  to  initiate  the 
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iteration  (2.6)  corresponding  to  77  —  ?/t  in  the  computation  of  a (?/;.). 

In  Table  1,  the  true  frequencies  were  taken  to  be  (/i ,  ^2)  =  (0.41,0.59),  which  are 
well-separated  as  compared  to  the  width  of  a  Fourier  bin  A /  =  2/n  =  0.02  in  our 
simulations.  As  we  can  see,  Prony’s  estimator  gives  poor  frequency  estimates,  while  both 
PF  and  GLS  significantly  improve  Prony’s  estimator  in  terms  of  mean-squared  error,  even 
with  a  relatively  small  77.  Moreover,  the  estimation  accuracy  can  be  further  improved 
by  increasing  7/  toward  1.  This  is  quite  intuitive  because  increasing  7/  is  equivalent  to 
shrinking  the  bandwidth  of  the  all-pole  filter  and  thereby  enhancing  the  sinusoids  (see  also 
Dragosevic  and  Stankovic,  1989;  Kedem  and  Yakowitz,  1991)  for  similar  ideas  of  shrinking 
bandwidth  in  frequency  estimation).  The  issue  is  that  the  shrinkage  should  be  based  upon 
reliable  frequency  estimates,  since  otherwise  the  filter  may  lock  on  false  locations  (an 
example  in  this  regard  will  be  presented  later).  This  is  the  reason  why  in  our  simulations 
the  iteration  for  a  higher  value  of  7/  was  initiated  by  the  estimates  corresponding  to 
a  slightly  smaller  value  of  77.  I11  this  way,  we  gradually  shrink  the  bandwidth  as  more 
reliable  estimates  become  available.  Table  1  shows  that  as  77  approaches  1  the  PF  and  GLS 
estimates  achieve  a  precision  (mse)  of  1.50  x  1CT6  —  very  close  to  the  asymptotic  variance 
of  the  nonlinear  least  squares  (NLS)  estimator  which  in  this  case  equals  1.22  x  10~6. 
Thus,  for  well-separated  frequencies,  PF  and  GLS  have  the  same  final  performance  as  77 
approaches  1. 

For  closely-spaced  frequencies,  however,  the  PF  estimator  performs  better  than  GLS, 
as  can  be  seen  from  Table  2  and  Figure  5.  In  this  experiment,  the  true  frequencies  were 
taken  to  be  (/i,/2)  =  (0.47,0.51)  while  all  other  conditions  remained  the  same  as  in  the 
previous  example.  Notice  that  the  true  frequencies  are  separated  only  by  two  Fourier 
bins.  It  is  interesting  to  observe  that  as  the  bandwidth  parameter  77  increases  toward  1 
the  mse  of  both  methods  does  not  monotonically  decrease  as  in  the  case  of  well-separated 
frequencies.  Instead,  it  starts  increasing  after  a  certain  value  of  77  (see  also  Figure  5). 
The  reason  is  the  following.  A  closer  examination  of  Table  1  and  Table  2  reveals  that 
as  77  approaches  1  the  bias  increases  while  the  variance  decreases  in  both  methods.  For 
well-separated  frequencies  as  shown  in  Table  1,  the  bias  never  dominates  the  variance, 
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Figure  5:  Plot  of  mse  (xlO  6)  against  77  for  closcly-spaced  frequencies. 


Table  3:  Estimation  With  ij  =  1 


(fnfl) 

mse 

£(/i)±v  ar(/0 

£(/2)±var(/2) 

complexity 

(0.41,0.59) 

4.33e-6 

0.409704  ±  1.30e-6 

0.590529  ±6.99e-6 

17.9  ±10.7 

(0.47,0.51) 

1.38e-4 

0.468628  ±9.37e-  7 

0.513672  ±2.49e-4 

24.3  ±63.3 

and  hence  the  mse  decreases  basically  along  with  the  decrease  of  the  variance.  On  the 
other  hand,  the  bias  becomes  dominant  in  the  case  of  closely-spaced  frequencies  as  77 
approaches  1  (see  Table  2),  and  a  trade-off  effect  between  bias  and  variance  takes  place. 
As  we  can  see  from  Figure  5,  the  best  value  of  77  for  the  PF  estimator  lies  between  0.96 
and  0.98  where  the  mse  achieves  the  smallest  values.  The  GLS  estimator  is  clearly  inferior 
to  the  PF  estimator  in  this  example  because  of  its  relatively  higher  bias.  Indeed,  the  bias 
and  variance  of  the  GLS  estimator  play  an  equal  role  in  the  mse,  since  their  magnitudes 
are  of  the  same  order  (see  Table  2). 

Table  2  illustrates  the  role  of  77  as  a  parameter  that  can  be  utilized  to  balance  the 
bias  and  variance  of  the  PF  estimator  for  minimizing  the  mean-squared  error.  Now,  in 
Table  3,  we  illustrate  the  role  of  ?/  in  the  convergence  of  the  fixed-point  iteration  (2.6) 
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Figure  6:  Plot  of  — log(mse)  against  SNR  in  dB  for  very  closely-spaced  frequencies.  The 
dotted  line  indicates  the  asymptotic  variance  of  NLS. 

when  initial  guesses  are  poor.  Instead  of  gradually  increasing  rj  toward  1,  as  done  in 
Table  1  and  Table  2,  the  frequency  estimates  in  Table  3  were  obtained  right  away  with 
T)  —  1,  using  Prony’s  estimator  as  the  initial  guess.  This  is  equivalent  to  the  iterative 
procedure  that  employs  the  all-pole  fdter  without  rj  and  starts  with  Prony’s  estimator. 
As  can  be  seen  from  Table  3,  the  mean-squared  error  is  higher  than  the  mse  reported  in 
Table  1  and  Table  2  corresponding  to  (gradually  achieved)  rj  —  1,  especially  for  closely- 
spaced  frequencies.  This  indicates  that  without  rj  in  the  all-pole  filter  the  iteration  (2.6) 
may  fail  to  converge  to  the  desired  fixed-point  when  poor  initial  guesses,  such  as  Prony’s 
estimator,  are  used.  The  reason  is  that  the  bandwidth  of  the  all-pole  filter  without  rj 
(or,  equivalently,  with  rj  =  1)  is  extremely  narrow.  Although  it  could  be  helpful  to  have 
a  narrow  bandwidth  for  enhancement  of  the  sinusoids  if  good  initial  guesses  are  used, 
a  narrow  bandwidth  might  cause  the  filter  to  “loose”  the  true  frequencies  when  tuned 
according  to  inaccurate  estimates.  Therefore,  the  safest  way  is  to  start  with  a  relatively 
small  rj,  to  accommodate  even  poor  initial  guesses,  and  then  gradually  increase  rj  as 
improved  estimates  from  previous  iterations  become  available. 

To  show  the  performance  of  the  PF  estimator  under  different  signal-to-noise  ratios, 
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Figure  6  presents  the  negative  logarithm  of  the  mse  for  various  values  of  SNR,  with  the 
dotted  line  indicating  the  asymptotic  variance  of  NLS  as  a  reference.  In  this  example, 
the  frequencies  were  taken  to  be  (/i,  /2)  =  (0.485, 0.495),  which  are  closely-spaced  within 
a  Fourier  bin,  and  the  bandwidth  parameter  T)  was  fixed  at  0.985  in  both  PF  and  GLS. 
Prony’s  estimator  again  was  used  to  initiate  the  fixed-point  iteration  (2.6)  for  both  meth¬ 
ods.  As  can  be  seen,  the  mse  of  the  PF  estimator  closely  follows  the  asymptotic  variance 
of  NLS  when  SNR  >  3  dB,  and  the  performance  of  both  PF  and  GLS  deteriorates  rapidly 
when  the  SNR  is  below  this  threshold.  The  poor  initial  accuracy  of  Prony’s  estimator  is 
largely  responsible  for  this  particular  value  of  threshold.  In  fact,  simulations  show  that 
the  threshold  can  be  extended  to  —2  dB  if  the  initial  guesses  are  taken  to  be  the  two 
Fourier  frequencies  which  correspond  to  the  largest  absolute  values  in  the  FFT  of  the 
data. 

In  Figure  7,  we  show  that  the  PF  estimator  is  capable  of  resolving  very  closely- 
spaced  frequencies  where  the  GLS  method  fails.  In  this  experiment,  the  true  frequencies, 
(fufz)  =  (0.41,0.412),  are  only  10%  apart  relative  to  the  width  of  a  Fourier  bin.  The 
frequency  estimates  were  obtained  with  t]  =  0.997  and  the  fixed-point  iteration  (2.6)  was 
initiated  by  Prony’s  estimator.  Figure  7(a)  shows  the  negative  logarithm  of  the  mse  for 
different  values  of  SNR  and  Figure  7(b)  presents  the  averages  of  the  frequency  estimates. 
Compared  to  the  GLS  estimator,  the  PF  estimator  has  a  much  smaller  bias  which  enables 
it  to  resolve  the  frequencies  as  well  as  to  achieve  a  smaller  mean-squared  error.  Notice 
that  the  GLS  estimator  gives  essentially  a  single  frequency  /  ps  0.411  between  the  two 
true  frequencies. 

Finally,  we  note  that  in  the  preceding  discussion  the  phases  were  fixed  at  zero.  Expe¬ 
rience  shows,  however,  that  when  the  phases  are  chosen  at  random  the  mse  may  worsen 
somewhat.  This  is  understandable  due  to  the  small  sample  size  which  cannot  explain  the 
addition  of  extra  sources  of  variability  (Kay  and  Marple,  1981). 
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